dspawpy.analysis package

Submodules

dspawpy.analysis.aimdtools module

class dspawpy.analysis.aimdtools.MSD(structures: List[Structure], select: str | List[int] = 'all', msd_type='xyz')

Bases: object

run()
class dspawpy.analysis.aimdtools.RDF(structures: Structure | List[Structure], rmin: float = 0.0, rmax: float = 10.0, ngrid: float = 101, sigma: float = 0.0)

Bases: object

get_coordination_number(ref_species, species, is_average=True)

returns running coordination number

Parameter

ref_species (list of species or just single specie str):

the reference species. The rdfs are calculated with these species at the center

species (list of species or just single specie str):

the species that we are interested in. The rdfs are calculated on these species.

is_average (bool): whether to take structural average

rtype:

numpy array

get_one_rdf(ref_species_index: str | List[str], species_index: str | List[str], index=0)

Get the RDF for one structure, indicated by the index of the structure in all structures

get_rdf(ref_species: str | List[str], species: str | List[str], is_average=True)

Wrapper to get the rdf for a given species pair

Parameter

ref_species (list of species or just single specie str):

The reference species. The rdfs are calculated with these species at the center

species (list of species or just single specie str):

the species that we are interested in. The rdfs are calculated on these species.

is_average (bool):

whether to take the average over all structures

returns:

x is the radial points, and rdf is the rdf value.

rtype:

(x, rdf)

class dspawpy.analysis.aimdtools.RMSD(structures: List[Structure])

Bases: object

run(base_index=0)
dspawpy.analysis.aimdtools.get_lagtime_msd(datafile: str | List[str], select: str | List[int] = 'all', msd_type: str = 'xyz', timestep: float = None)

计算不同时间步长下的均方位移

Parameters:
  • datafile (str or list of str) --

    • aimd.h5/aimd.json文件路径或包含这些文件的文件夹路径(优先寻找aimd.h5)

    • 写成列表将依次读取数据并合并到一起

    • 例如['aimd1.h5', 'aimd2.h5', '/data/home/my_aimd_task']

  • select (str or list of int) -- 选择原子序号或元素,原子序号从0开始;默认为'all',计算所有原子

  • msd_type (str) -- 计算MSD的类型,可选xyz,xy,xz,yz,x,y,z,默认为'xyz',即计算所有分量

  • timestep (float) -- 相邻结构的时间间隔,单位为fs,默认None,将从datafile中读取,失败则设为1.0fs; 若不为None,则将使用该值计算时间序列

Returns:

  • lagtime (np.ndarray) -- 时间序列

  • result (np.ndarray) -- 均方位移序列

Examples

>>> from dspawpy.analysis.aimdtools import get_lagtime_msd
>>> lagtime, msd = get_lagtime_msd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5')
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating MSD...
>>> lagtime
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 1.997e+03, 1.998e+03,
       1.999e+03])
>>> msd
array([0.00000000e+00, 8.80447550e-02, 1.83768134e-01, ...,
       9.66185452e+02, 9.66812755e+02, 9.67357702e+02])
>>> lagtime, msd = get_lagtime_msd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', select='H')
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating MSD...
>>> lagtime, msd = get_lagtime_msd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', select=[0,1])
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating MSD...
>>> lagtime, msd = get_lagtime_msd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', select=['H','O'])
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating MSD...
>>> lagtime, msd = get_lagtime_msd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', select=0)
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating MSD...
dspawpy.analysis.aimdtools.get_lagtime_rmsd(datafile: str | List[str], timestep: float = None)
Parameters:
  • datafile (str or list of str) --

    • aimd.h5/aimd.json文件路径或包含这些文件的文件夹路径(优先寻找aimd.h5)

    • 写成列表将依次读取数据并合并到一起

    • 例如['aimd1.h5', 'aimd2.h5', '/data/home/my_aimd_task']

  • timestep (float) -- 相邻结构的时间间隔,单位为fs,默认None,将从datafile中读取,失败则设为1.0fs; 若不为None,则将使用该值计算时间序列

Returns:

  • lagtime (numpy.ndarray) -- 时间序列

  • rmsd (numpy.ndarray) -- 均方根偏差序列

Examples

>>> from dspawpy.analysis.aimdtools import get_lagtime_rmsd
>>> lagtime, rmsd = get_lagtime_rmsd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', timestep=0.1)
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating RMSD...
>>> lagtime
array([0.000e+00, 1.000e-01, 2.000e-01, ..., 1.997e+02, 1.998e+02,
       1.999e+02])
>>> rmsd
array([ 0.        ,  0.04849931,  0.09181907, ..., 31.09991413,
       31.10077061, 31.10237454])
dspawpy.analysis.aimdtools.get_rs_rdfs(datafile: str | List[str], ele1: str, ele2: str, rmin: float = 0, rmax: float = 10, ngrid: float = 101, sigma: float = 0)

计算rdf分布函数

Parameters:
  • datafile (str or list of str) --

    • aimd.h5/aimd.json文件路径或包含这些文件的文件夹路径(优先寻找aimd.h5)

    • 写成列表将依次读取数据并合并到一起

    • 例如['aimd1.h5', 'aimd2.h5', '/data/home/my_aimd_task']

  • ele1 (list) -- 中心元素

  • ele2 (list) -- 相邻元素

  • rmin (float) -- 径向分布最小值,默认为0

  • rmax (float) -- 径向分布最大值,默认为10

  • ngrid (int) -- 径向分布网格数,默认为101

  • sigma (float) -- 平滑参数

Returns:

  • r (numpy.ndarray) -- 径向分布网格点

  • rdf (numpy.ndarray) -- 径向分布函数

Examples

>>> from dspawpy.analysis.aimdtools import get_rs_rdfs
>>> rs, rdfs = get_rs_rdfs(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5',ele1='H',ele2='O')
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating RDF...
>>> rdfs
array([0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       6.90409153e-05, 7.47498606e-04, 2.40555636e-03, 2.57666881e-03,
       2.53257125e-03, 2.37964722e-03, 1.80028135e-03, 1.20224160e-03,
       7.57927477e-04, 3.41004760e-04, 2.85680407e-04, 1.72795412e-04,
       1.49718401e-04, 1.44477999e-04, 2.23702566e-04, 2.11836218e-04,
       1.34146448e-04, 1.78839343e-04, 9.16910363e-05, 1.45492991e-05,
       3.70494493e-05, 3.72784113e-05, 2.82725837e-05, 3.19750004e-05,
       8.46456322e-07, 4.72401470e-06, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00])
dspawpy.analysis.aimdtools.plot_msd(lagtime: ndarray, result: ndarray, xlim: List[float] = None, ylim: List[float] = None, figname: str = None, show: bool = True, ax=None, **kwargs)

AIMD任务完成后,计算均方位移(MSD)

Parameters:
  • lagtime (np.ndarray) -- 时间序列

  • result (np.ndarray) -- 均方位移序列

  • xlim (list of float) -- x轴的范围,默认为None,自动设置

  • ylim (list of float) -- y轴的范围,默认为None,自动设置

  • figname (str) -- 图片名称,默认为None,不保存图片

  • show (bool) -- 是否显示图片,默认为True

  • ax (matplotlib axes object) -- 用于将图片绘制到matplotlib的子图上

  • **kwargs (dict) -- 其他参数,如线条宽度、颜色等,传递给plt.plot函数

Return type:

MSD分析后的图片

Examples

>>> from dspawpy.analysis.aimdtools import get_lagtime_msd, plot_msd

指定h5文件位置,用 get_lagtime_msd 函数获取数据,select 参数选择第n个原子(不是元素)

>>> lagtime, msd = get_lagtime_msd('/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', select=[0])
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating MSD...

用获取的数据画图并保存

>>> plot_msd(lagtime, msd, figname='/data/home/hzw1002/dspawpy_repo/test/out/MSD.png', show=False)
==> /data/home/hzw1002/dspawpy_repo/test/out/MSD.png
<Axes: xlabel='Time (fs)', ylabel='MSD ($Å^2$)'>
dspawpy.analysis.aimdtools.plot_rdf(rs: ndarray, rdfs: ndarray, ele1: str, ele2: str, xlim: list = None, ylim: list = None, figname: str = None, show: bool = True, ax: Axes = None, **kwargs)

AIMD计算后分析rdf并画图

Parameters:
  • rs (numpy.ndarray) -- 径向分布网格点

  • rdfs (numpy.ndarray) -- 径向分布函数

  • ele1 (list) -- 中心元素

  • ele2 (list) -- 相邻元素

  • xlim (list) -- x轴范围,默认为None,即自动设置

  • ylim (list) -- y轴范围,默认为None,即自动设置

  • figname (str) -- 图片名称,默认为None,即不保存图片

  • show (bool) -- 是否显示图片,默认为True

  • ax (matplotlib.axes.Axes) -- 画图的坐标轴,默认为None,即新建坐标轴

  • **kwargs (dict) -- 其他参数,如线条宽度、颜色等,传递给plt.plot函数

Return type:

rdf分析后的图片

Examples

>>> from dspawpy.analysis.aimdtools import get_rs_rdfs, plot_rdf

先获取rs和rdfs数据作为xy轴数据

>>> rs, rdfs = get_rs_rdfs('/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', 'H', 'O', rmax=6)
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating RDF...

将xy轴数据传入plot_rdf函数绘图

>>> plot_rdf(rs, rdfs, 'H','O', figname='/data/home/hzw1002/dspawpy_repo/test/out/RDF.png', show=False)
==> /data/home/hzw1002/dspawpy_repo/test/out/RDF.png
dspawpy.analysis.aimdtools.plot_rmsd(lagtime: ndarray, result: ndarray, xlim: list = None, ylim: list = None, figname: str = None, show: bool = True, ax=None, **kwargs)

AIMD计算后分析rmsd并画图

Parameters:
  • lagtime -- 时间序列

  • result -- 均方根偏差序列

  • xlim (list) -- x轴范围

  • ylim (list) -- y轴范围

  • figname (str) -- 图片保存路径

  • show (bool) -- 是否显示图片

  • ax (matplotlib.axes._subplots.AxesSubplot) -- 画子图的话,传入子图对象

  • **kwargs (dict) -- 传入plt.plot的参数

Return type:

rmsd分析结构的图片

Examples

>>> from dspawpy.analysis.aimdtools import get_lagtime_rmsd, plot_rmsd

timestep 表示时间步长

>>> lagtime, rmsd = get_lagtime_rmsd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', timestep=0.1)
Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5...
Calculating RMSD...

直接保存为RMSD.png图片

>>> plot_rmsd(lagtime, rmsd, figname='/data/home/hzw1002/dspawpy_repo/test/out/RMSD.png', show=False)
==> /data/home/hzw1002/dspawpy_repo/test/out/RMSD.png
<Axes: xlabel='Time (fs)', ylabel='RMSD (Å)'>

dspawpy.analysis.vacf module

This module has not been tested yet, use at your own risk!

dspawpy.analysis.vacf.mirror(arr, axis=0)

Mirror array arr at index 0 along axis. The length of the returned array is 2*arr.shape[axis]-1 .

dspawpy.analysis.vacf.norm_int(y, x, area=1.0, scale=True, func=<function simps>)

Normalize integral area of y(x) to area. :param x: :type x: numpy 1d arrays :param y: :type y: numpy 1d arrays :param area: :type area: float :param scale: Scale x and y to the same order of magnitude before integration.

This may be necessary to avoid numerical trouble if x and y have very different scales.

Parameters:

func (callable) -- Function to do integration (like scipy.integrate.{simps,trapz,...} Called as func(y,x). Default: simps

Return type:

scaled y

Notes

The argument order y,x might be confusing. x,y would be more natural but we stick to the order used in the scipy.integrate routines.

dspawpy.analysis.vacf.pad_zeros(arr, axis=0, where='end', nadd=None, upto=None, tonext=None, tonext_min=None)

Pad an nd-array with zeros. Default is to append an array of zeros of the same shape as arr to arr's end along axis. :param arr: :type arr: nd array :param axis: :type axis: the axis along which to pad :param where: start ("prepend to array") of axis :type where: string {'end', 'start'}, pad at the end ("append to array") or :param nadd: of an 1d array) :type nadd: number of items to padd (i.e. nadd=3 means padd w/ 3 zeros in case :param upto: :type upto: pad until arr.shape[axis] == upto :param tonext: array has a length of power of two) :type tonext: bool, pad up to the next power of two (pad so that the padded :param tonext_min: power of two for which the resulting array length along axis is at

least tonext_min; the default is tonext_min = arr.shape[axis]

Parameters:
  • nadd (Use only one of) --

  • upto --

  • tonext. --

Return type:

padded array

Examples

>>> # 1d
>>> pad_zeros(a)
array([1, 2, 3, 0, 0, 0])
>>> pad_zeros(a, nadd=3)
array([1, 2, 3, 0, 0, 0])
>>> pad_zeros(a, upto=6)
array([1, 2, 3, 0, 0, 0])
>>> pad_zeros(a, nadd=1)
array([1, 2, 3, 0])
>>> pad_zeros(a, nadd=1, where='start')
array([0, 1, 2, 3])
>>> # 2d
>>> a=arange(9).reshape(3,3)
>>> pad_zeros(a, nadd=1, axis=0)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8],
       [0, 0, 0]])
>>> pad_zeros(a, nadd=1, axis=1)
array([[0, 1, 2, 0],
       [3, 4, 5, 0],
       [6, 7, 8, 0]])
>>> # up to next power of two
>>> 2**arange(10)
array([  1,   2,   4,   8,  16,  32,  64, 128, 256, 512])
>>> pydos.pad_zeros(arange(9), tonext=True).shape
(16,)
dspawpy.analysis.vacf.pdos(vel, dt=1.0, m=None, full_out=False, area=1.0, window=True, npad=None, tonext=False, mirr=False, method='direct')

Phonon DOS by FFT of the VACF or direct FFT of atomic velocities.

Integral area is normalized to area. It is possible (and recommended) to zero-padd the velocities (see npad).

Parameters:
  • vel (3d array (nstep, natoms, 3)) -- atomic velocities

  • dt (time step) --

  • m (1d array (natoms,),) -- atomic mass array, if None then mass=1.0 for all atoms is used

  • full_out (bool) --

  • area (float) -- normalize area under frequency-PDOS curve to this value

  • window (bool) -- use Welch windowing on data before FFT (reduces leaking effect, recommended)

  • npad ({None, int}) -- method='direct' only: Length of zero padding along axis. npad=None = no padding, npad > 0 = pad by a length of (nstep-1)*npad. npad > 5 usually results in sufficient interpolation.

  • tonext (bool) -- method='direct' only: Pad vel with zeros along axis up to the next power of two after the array length determined by npad. This gives you speed, but variable (better) frequency resolution.

  • mirr (bool) -- method='vacf' only: mirror one-sided VACF at t=0 before fft

Returns:

  • if full_out = False -- | (faxis, pdos) | faxis : 1d array [1/unit(dt)] | pdos : 1d array, the phonon DOS, normalized to area

  • if full_out = True -- | if method == 'direct': | (faxis, pdos, (full_faxis, full_pdos, split_idx)) | if method == 'vavcf': | (faxis, pdos, (full_faxis, full_pdos, split_idx, vacf, fft_vacf)) | fft_vacf : 1d complex array, result of fft(vacf) or fft(mirror(vacf)) | vacf : 1d array, the VACF

Notes

padding (only method='direct'): With npad we pad the velocities vel with npad*(nstep-1) zeros along axis (the time axis) before FFT b/c the signal is not periodic. For npad=1, this gives us the exact same spectrum and frequency resolution as with pdos(..., method='vacf',mirr=True) b/c the array to be fft'ed has length 2*nstep-1 along the time axis in both cases (remember that the array length = length of the time axis influences the freq. resolution). FFT is only fast for arrays with length = a power of two. Therefore, you may get very different fft speeds depending on whether 2*nstep-1 is a power of two or not (in most cases it won't). Try using tonext but remember that you get another (better) frequency resolution.

References

[1] Phys Rev B 47(9) 4863, 1993

See also

pwtools.signal.fftsample(), pwtools.signal.acorr(), direct_pdos(), vacf_pdos()

dspawpy.analysis.vacf.slicetake(a, sl, axis=None, copy=False)

The equivalent of numpy.take(a, ..., axis=<axis>), but accepts slice objects instead of an index array. Also by default, it returns a view and no copy. :param a: :type a: numpy ndarray :param sl:

axis=<int>

one slice object for that axis

axis=None

sl is a list or tuple of slice objects, one for each axis. It must index the whole array, i.e. len(sl) == len(a.shape).

Parameters:
  • axis ({None, int}) --

  • copy (bool, return a copy instead of a view) --

Return type:

A view into a or copy of a slice of a.

Examples

>>> from numpy import s_
>>> a = np.random.rand(20,20,20)
>>> b1 = a[:,:,10:]
>>> # single slice for axis 2
>>> b2 = slicetake(a, s_[10:], axis=2)
>>> # tuple of slice objects
>>> b3 = slicetake(a, s_[:,:,10:])
>>> (b2 == b1).all()
True
>>> (b3 == b1).all()
True
>>> # simple extraction too, sl = integer
>>> (a[...,5] == slicetake(a, 5, axis=-1))
True
dspawpy.analysis.vacf.sum(arr, axis=None, keepdims=False, **kwds)

This numpy.sum() with some features implemented which can be found in numpy v1.7 and later: axis can be a tuple to select arbitrary axes to sum over. We also have a keepdims keyword, which however works completely different from numpy. Docstrings shamelessly stolen from numpy and adapted here and there. :param arr: :type arr: nd array :param axis: Axis or axes along which a sum is performed. The default (axis =

None) is to perform a sum over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis. If this is a tuple of ints, a sum is performed on multiple axes, instead of a single axis or all the axes as before.

Parameters:
  • keepdims (bool, optional) -- If this is set to True, the axes from axis are left in the result and the reduction (sum) is performed for all remaining axes. Therefore, it reverses the axis to be summed over.

  • **kwds (passed to np.sum().) --

Examples

>>> a=rand(2,3,4)
>>> num.sum(a)
12.073636268676152
>>> a.sum()
12.073636268676152
>>> num.sum(a, axis=1).shape
(2, 4)
>>> num.sum(a, axis=(1,)).shape
(2, 4)
>>> # same as axis=1, i.e. it inverts the axis over which we sum
>>> num.sum(a, axis=(0,2), keepdims=True).shape
(2, 4)
>>> # numpy's keepdims has another meaning: it leave the summed axis (0,2)
>>> # as dimension of size 1 to allow broadcasting
>>> numpy.sum(a, axis=(0,2), keepdims=True).shape
(1, 3, 1)
>>> num.sum(a, axis=(1,)) - num.sum(a, axis=1)
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
>>> num.sum(a, axis=(0,2)).shape
(3,)
>>> num.sum(a, axis=(0,2)) - a.sum(axis=0).sum(axis=1)
array([ 0.,  0.,  0.])
dspawpy.analysis.vacf.vacf(vel, m=None, method=3)

Reference implementation for calculating the VACF of velocities in 3d array vel. This is slow. Use for debugging only. For production, use fvacf().

Parameters:
  • vel (3d array, (nstep, natoms, 3)) -- Atomic velocities.

  • m (1d array (natoms,)) -- Atomic masses.

  • method (int) --

    1 : 3 loops
    2 : replace 1 inner loop
    3 : replace 2 inner loops

Returns:

c -- VACF

Return type:

1d array (nstep,)

dspawpy.analysis.vacf.welch(M, sym=1)

Welch window. Function skeleton shamelessly stolen from scipy.signal.bartlett() and others.